Goto

Collaborating Authors

 turing machine


Constant Bit-size Transformers Are Turing Complete

Neural Information Processing Systems

We prove that any Turing machine running on inputs of arbitrary length can be simulated by a constant bit-size transformer, as long as the context window is sufficiently long. This improves previous works, which require scaling up either the model's precision or the number of parameters on longer inputs. Furthermore, we prove that the complexity class SPACE[s(n)] exactly characterizes the expressive power of a constant bit-size transformer with a context window of length s(n). Our approach relies on simulating Post machines, a Turing-complete computational model. Post machines can be modeled as automata equipped with a queue, exhibiting computational behaviors naturally aligned with those of transformers. The behavioral similarity between transformers and Post machines may offer new insights into the mechanisms underlying the reasoning abilities of transformers.


The World Is Bigger! A Computationally-Embedded Perspective on the Big World Hypothesis

Neural Information Processing Systems

Continual learning is often motivated by the idea, known as the big world hypothesis, that "the world is bigger" than the agent. Recent problem formulations capture this idea by explicitly constraining an agent relative to the environment. These constraints lead to solutions in which the agent continually adapts to best use its limited capacity, rather than converging to a fixed solution. However, explicit constraints can be ad hoc, difficult to incorporate, and may limit the effectiveness of scaling up the agent's capacity. In this paper, we characterize a problem setting in which an agent, regardless of its capacity, is constrained by being embedded in the environment.









Mathematicians spent 2025 exploring the edge of mathematics

New Scientist

In 2025, the edges of mathematics came a little more sharply into view when members of the online Busy Beaver Challenge community closed in on a huge number that threatens to defy the logical underpinnings of the subject. This number is the next in the "Busy Beaver" sequence, a series of ever-larger numbers that emerges from a seemingly simple question - how do we know if a computer program will run forever? To find out, researchers turn to the work of mathematician Alan Turing, who showed that any computer algorithm can be mimicked by imagining a simplified device called a Turing machine. More complex algorithms correspond to Turing machines with larger sets of instructions or, in mathematical parlance, more states. For example BB(1) is 1 and BB(2) is 6, so making the algorithm twice as complex increases its runtime sixfold.